1  Introduction


Suppose that an observation $X$ is obtained from a distribution with density

$$f(x\mid\theta) = c(\theta)\exp\{\theta x\}\,h(x), \qquad -\infty \le a < x < b \le +\infty, \qquad (1)$$

where $h(x) > 0$ for $x \in (a, b)$ and $h(x)$ is bounded from below on any compact subset of $(a, b)$. The parameter $\theta$ is distributed according to an unknown and unspecified prior $G$ on the parameter space $\Omega$, a subset of the natural parameter space $\Omega_0 = \{\theta : c(\theta) > 0\}$.

Suppose that one wants to estimate $\theta$ after observing $X = x$. Under squared error loss, the Bayes estimator is given by $\phi_G(x) = E[\theta \mid X = x]$. It can be computed if $G$ is known.

When $G$ is unknown, this solution is not available. One remedy is the empirical Bayes approach, which constructs an empirical Bayes estimator. This approach assumes that $n$ independent past observations $X_1, \ldots, X_n$ are available. Thus an estimator of $\theta$ can be constructed based on $X_1, \ldots, X_n$ and $X = x$. This estimator is called the empirical Bayes estimator and is denoted by $\phi_n(x, X_1, \ldots, X_n) = \phi_n(x) = \phi_n$. The Bayes risk $R(\phi_G, G)$ is

$$R(\phi_G, G) = \int\!\!\int (\phi_G(x) - \theta)^2 f(x\mid\theta)\,dx\,dG(\theta). \qquad (2)$$

The overall risk of $\phi_n$, denoted by $R(\phi_n, G)$, is

$$R(\phi_n, G) = E\Big[\int\!\!\int (\phi_n(x) - \theta)^2 f(x\mid\theta)\,dx\,dG(\theta)\Big]. \qquad (3)$$

$R(\phi_n, G) - R(\phi_G, G)$, the difference between the (overall) risk of $\phi_n$ and the Bayes risk, is called the regret of the estimator $\phi_n$ and is used to measure the performance of $\phi_n$.

The above estimation problem has been considered by many authors (see Lin (1975), Singh (1976), Singh (1979), Pensky (1998) and the references listed there). Singh (1979) significantly improved the previous results in terms of the rate of convergence. He constructed an empirical Bayes estimator and showed that the estimator has a rate of convergence of [rate illegible in source]. Pensky (1998) applied advanced wavelet techniques to construct empirical Bayes estimators and obtained a better rate.

For this empirical Bayes estimation problem, a natural question arises: what is the best possible rate? To answer this question, a minimax lower bound for empirical Bayes estimators is derived, and it is shown that the best possible rate is $O(1/n)$ if $\theta$ is distributed within a compact (bounded) set.

We shall also construct an empirical Bayes estimator using the kernel sequence method. The kernel sequence method enables us to use the $C^\infty$-smoothness of $\alpha_G(x)$ and $\psi_G(x)$ (defined in Section 2). Thus improved estimators of $\alpha_G(x)$ and $\psi_G(x)$ are obtained. Based on these estimators, we construct an empirical Bayes estimator $\phi_n(x)$ and show that $\phi_n(x)$ has a rate of convergence of $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$ under the assumption $\Omega \subset [\theta_1, \theta_2] \subset \Omega_0$.

This  paper  is  organized  as  follows:  a  minimax  lower  bound  is  derived  in  Section  2  by 
converting  the  global  problem  into  a  local  problem,  identifying  the  Bayes  estimator  as  a 
functional  of  the  marginal  density  of  X,  and  constructing  the  hardest  two-point  subproblem. 
The  construction  of  the  estimator  $\phi_n(x)$  is  presented  in  Section  3  and  its  performance  is  also
studied  there.  In  Section  4,  we  present  a  few  examples,  which  include  three  examples  used 
in  Singh  (1979)  and  the  comparisons  of  our  results  with  his.  The  proofs  are  given  in  Section 
5.  In  Section  6,  we  summarize  our  results  and  make  some  comparisons  with  the  results 
published  recently  in  the  literature. 

Finally,  the  readers  may  refer  to  Robbins  (1956,  1964)  to  learn  more  about  the  empirical 
Bayes  approach.  As  for  applications  of  the  empirical  Bayes  estimation,  one  may  see  Bendel 
and  Carlin  (1990),  Louis  (1991),  Desouza  (1991),  Mollie  and  Richardson  (1991),  Norberg 
(1989),  Lahiri  and  Park  (1991),  Chen  and  Singpurwalla  (1996)  and  Pensky  and  Singh  (1999).

2  Lower  Bound  of  Empirical  Bayes  Estimators


We shall obtain a minimax lower bound for empirical Bayes estimators. This will show that the best possible rate for any empirical Bayes estimator is $O(1/n)$.

2.1  Conversion  to  a  Local  Problem 


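Under squared error loss, the Bayes estimator $\phi_G(x)$ is the posterior mean of $\theta$ given $X = x$. Simple calculations show that

$$\phi_G(x) = E[\theta \mid X = x] = \frac{\int \theta\, c(\theta)\exp(\theta x)\,dG(\theta)}{\int c(\theta)\exp(\theta x)\,dG(\theta)}. \qquad (4)$$

Let $\alpha_G(x) = \int c(\theta)\exp(\theta x)\,dG(\theta)$ and $\psi_G(x) = \int \theta\, c(\theta)\exp(\theta x)\,dG(\theta)$. Then the Bayes estimator of $\theta$ can be written as

$$\phi_G(x) = \frac{\psi_G(x)}{\alpha_G(x)}. \qquad (5)$$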
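As a small illustration of (4) and (5), the sketch below evaluates $\psi_G(x)/\alpha_G(x)$ for the normal $(\theta, 1)$ family with a discrete prior and compares it with the posterior mean computed directly from the normal likelihood. The three-atom prior, the atom locations and all function names are assumptions made for the example only; they do not come from the paper.

```python
import math

def c(theta):
    # Normalizing factor of the normal(theta, 1) family: c(theta) = exp(-theta^2/2) / sqrt(2*pi).
    return math.exp(-theta ** 2 / 2) / math.sqrt(2 * math.pi)

def alpha_G(x, atoms, weights):
    # alpha_G(x) = integral of c(theta) exp(theta x) dG(theta), here a finite sum.
    return sum(w * c(t) * math.exp(t * x) for t, w in zip(atoms, weights))

def psi_G(x, atoms, weights):
    # psi_G(x) = integral of theta c(theta) exp(theta x) dG(theta).
    return sum(w * t * c(t) * math.exp(t * x) for t, w in zip(atoms, weights))

def bayes_estimator(x, atoms, weights):
    # Equation (5): phi_G(x) = psi_G(x) / alpha_G(x).
    return psi_G(x, atoms, weights) / alpha_G(x, atoms, weights)

def posterior_mean_direct(x, atoms, weights):
    # Posterior mean from the N(theta, 1) likelihood, without factoring out h(x).
    dens = [w * math.exp(-(x - t) ** 2 / 2) / math.sqrt(2 * math.pi)
            for t, w in zip(atoms, weights)]
    return sum(t * d for t, d in zip(atoms, dens)) / sum(dens)

if __name__ == "__main__":
    atoms, weights = [-1.0, 0.5, 2.0], [0.2, 0.5, 0.3]  # hypothetical prior
    for x in (-1.0, 0.3, 2.5):
        print(x, bayes_estimator(x, atoms, weights), posterior_mean_direct(x, atoms, weights))
```

The two columns agree because $h(x)$ cancels in the ratio, which is why only $\alpha_G$ and $\psi_G$ need to be estimated.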


Suppose that the prior $G$ has compact support $[\theta_1, \theta_2]$, or that its support is contained in the compact set $[\theta_1, \theta_2]$. Let $\mathcal{G}$ be the class of priors of this type, i.e.,

$$\mathcal{G} = \big\{G : G \text{ has support } \Omega \subset [\theta_1, \theta_2] \subset \{\theta : c(\theta) > 0\}\big\}. \qquad (6)$$

Suppose that $\phi_n(x)$ is an empirical Bayes estimator based on the past data $(X_1, X_2, \ldots, X_n)$ and the present data $X = x$. Let $\Phi$ be the class of empirical Bayes estimators of type $\phi_n$. We are interested in a lower bound of

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)]. \qquad (7)$$

For any empirical Bayes estimator, Singh (1979) proved that

$$R(\phi_n, G) - R(\phi_G, G) = \int [E(\phi_n(x) - \phi_G(x))^2]\,\alpha_G(x)h(x)\,dx. \qquad (8)$$

Therefore we have

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] = \inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}} \int [E(\phi_n(x) - \phi_G(x))^2]\,\alpha_G(x)h(x)\,dx. \qquad (9)$$

The RHS of the above equation is a global minimax quantity for empirical Bayes estimators, rather than the local minimax quantity

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [E(\phi_n(x) - \phi_G(x))^2] \quad \text{for some fixed } x. \qquad (10)$$

So we first need to convert the global minimax problem into a local minimax problem. For this purpose, we restrict the supremum of the regret to two prior distributions and use the fact that the maximum of two nonnegative numbers is at least half of their sum, while their sum is at least their maximum. This allows us to move the "sup" inside the integral in (9). By further moving the "inf" inside the integral, the global minimax problem is changed into a point-wise (local) problem.


Lemma 2.1 says that we can find $\phi_{minimax}(x)$ locally for each $x$ and then obtain the global lower bound by integration. The proof of Lemma 2.1 is given in Section 5.

2.2  A Functional of the Marginal Density of X

Next we need to find $\phi_{minimax}(x)$ for each $x$. This is done by considering the estimation of a functional of the marginal density of $X$ and constructing the hardest two-point subproblem associated with it.

Let $f_G(x) = \int f(x\mid\theta)\,dG(\theta)$ be the marginal density of $X$. Then $f_G(x) = \alpha_G(x)h(x)$. Assume that $h'(x)$ exists for $x \in (a, b)$. Since $\alpha_G'(x) = \psi_G(x)$, differentiating $f_G = \alpha_G h$ gives

$$\phi_G(x) = \frac{f_G'(x)}{f_G(x)} - \frac{h'(x)}{h(x)}. \qquad (13)$$

For a fixed $x$, since $h(x)$ is known, the RHS of the above equation is a functional of $f_G$. Let $T_x f_G$ denote this functional, i.e.,

$$T_x f_G = \frac{f_G'(x)}{f_G(x)} - \frac{h'(x)}{h(x)} = \phi_G(x). \qquad (14)$$

We have thus expressed the Bayes estimator $\phi_G(x)$ as a functional of $f_G$. To find $\phi_{minimax}(x)$ is to estimate the functional $T_x f_G$ of $f_G$ based on a sample from $f_G$. Therefore we apply the results in Donoho and Liu (1991) and obtain the following lemma.
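The identity behind (13) and (14) can be checked numerically. The sketch below is an illustration only: it assumes the normal $(\theta, 1)$ family, for which $h'(x)/h(x) = -x$, a hypothetical three-atom prior, and a central finite difference for $f_G'$. It compares $f_G'(x)/f_G(x) - h'(x)/h(x)$ with the posterior mean.

```python
import math

def marginal_density(x, atoms, weights):
    # f_G(x) for the normal(theta, 1) family with a discrete prior G.
    return sum(w * math.exp(-(x - t) ** 2 / 2) / math.sqrt(2 * math.pi)
               for t, w in zip(atoms, weights))

def functional_T(x, atoms, weights, step=1e-5):
    # T_x f_G = f_G'(x)/f_G(x) - h'(x)/h(x); here h(x) = exp(-x^2/2), so h'/h = -x.
    # f_G' is approximated by a central difference (an assumption of this sketch).
    f_plus = marginal_density(x + step, atoms, weights)
    f_minus = marginal_density(x - step, atoms, weights)
    derivative = (f_plus - f_minus) / (2 * step)
    return derivative / marginal_density(x, atoms, weights) + x

def posterior_mean(x, atoms, weights):
    # Direct posterior mean E[theta | X = x] under the same prior.
    dens = [w * math.exp(-(x - t) ** 2 / 2) for t, w in zip(atoms, weights)]
    return sum(t * d for t, d in zip(atoms, dens)) / sum(dens)

if __name__ == "__main__":
    atoms, weights = [-1.0, 0.5, 2.0], [0.2, 0.5, 0.3]  # hypothetical prior
    for x in (-0.7, 0.0, 1.8):
        print(x, functional_T(x, atoms, weights), posterior_mean(x, atoms, weights))
```

The functional only uses the marginal density and the known $h$, which is what makes a lower bound for estimating $T_x f_G$ from a sample of $f_G$ relevant to the empirical Bayes problem.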


Lemma 2.2. Assume that $h'(x)$ exists for $x \in (a, b)$. For any $G_1$ and $G_2 \in \mathcal{G}$, let $f_{G_1}$ and $f_{G_2}$ be the corresponding marginal densities of $X$. If for some constant $C > 0$,

$$\int \Big[\sqrt{f_{G_1}(x)} - \sqrt{f_{G_2}(x)}\Big]^2 dx \le \frac{C}{n}, \qquad (15)$$

then for all $x \in (a, b)$, $\phi_{minimax}(x)$ defined by (11) satisfies

$$\phi_{minimax}(x) \ge l_1\,[\phi_{G_1}(x) - \phi_{G_2}(x)]^2, \qquad (16)$$

where $l_1 > 0$ is a constant independent of $x$.


Combining Lemma 2.1 and Lemma 2.2, we have: for some $l_2 > 0$,

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] \ge l_2 \int [\phi_{G_1}(x) - \phi_{G_2}(x)]^2\,[\alpha_{G_1}(x) \wedge \alpha_{G_2}(x)]\,h(x)\,dx \qquad (17)$$

for any $G_1$ and $G_2$ in $\mathcal{G}$ subject to

$$\int \Big(\sqrt{f_{G_1}(x)} - \sqrt{f_{G_2}(x)}\Big)^2 dx \le \frac{C}{n} \qquad (18)$$

for some $C > 0$.


2.3  A  Lower  Bound 


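In the following we shall construct suitable $G_1$ and $G_2$ in $\mathcal{G}$ such that a desired lower bound of $\phi_{minimax}(x)$ can be obtained. Choose $x_0 \in (a, b)$. Let

$$g_0(\theta) = m_0\,[c(\theta)]^{-1}\,I_{[\theta_1 \le \theta \le \theta_2]} \qquad (19)$$

and

$$g_1(\theta) = m_1 \exp(\theta x_0)\,g_0(\theta), \qquad (20)$$

where $m_0$ and $m_1$ are normalizing constants. Denote

$$g_2(\theta) = \frac{\sqrt{n}\,g_1(\theta) + g_0(\theta)}{\sqrt{n} + 1}. \qquad (21)$$

Clearly, $g_0$, $g_1$ and $g_2$ are prior densities with their cdf's in $\mathcal{G}$.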
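The role of the mixing weight $\sqrt{n}$ in (21) can be seen numerically: the squared Hellinger distance between the marginals under $g_1$ and $g_2$ shrinks like $1/n$. The sketch below is an illustration under assumptions that are not in the paper: the normal $(\theta, 1)$ family, $[\theta_1, \theta_2] = [-1, 1]$, $x_0 = 0.5$, and a trapezoid rule on the truncated range $[-8, 8]$. It checks that $n\int(\sqrt{f_1} - \sqrt{f_2})^2 dx$ stays below $\int (f_1 - f_0)^2/f_1\,dx$, which does not depend on $n$.

```python
import math

THETA1, THETA2, X0 = -1.0, 1.0, 0.5   # hypothetical choices for the illustration

def h(x):
    # h(x) of the normal(theta, 1) family.
    return math.exp(-x * x / 2)

def exp_integral(y):
    # Integral of exp(theta * y) over [THETA1, THETA2].
    if abs(y) < 1e-12:
        return THETA2 - THETA1
    return (math.exp(THETA2 * y) - math.exp(THETA1 * y)) / y

def trapezoid(values, step):
    # Composite trapezoid rule on an equally spaced grid.
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def marginals(grid, step):
    # Under g0 the factor c(theta) cancels, so f0 is proportional to h(x) * int exp(theta x) dtheta;
    # under g1 the exponent is shifted by X0. Both are normalized numerically on the grid.
    raw0 = [h(x) * exp_integral(x) for x in grid]
    raw1 = [h(x) * exp_integral(x + X0) for x in grid]
    z0, z1 = trapezoid(raw0, step), trapezoid(raw1, step)
    return [v / z0 for v in raw0], [v / z1 for v in raw1]

def hellinger_sq(n, f0, f1, step):
    # Squared Hellinger-type distance between f1 and f2 = (sqrt(n) f1 + f0) / (sqrt(n) + 1).
    rn = math.sqrt(n)
    vals = [(math.sqrt(a1) - math.sqrt((rn * a1 + a0) / (rn + 1))) ** 2
            for a0, a1 in zip(f0, f1)]
    return trapezoid(vals, step)

def chi_square_bound(f0, f1, step):
    # Integral of (f1 - f0)^2 / f1, an n-free upper bound for n * hellinger_sq.
    return trapezoid([(a1 - a0) ** 2 / a1 for a0, a1 in zip(f0, f1)], step)

STEP = 0.01
GRID = [-8 + STEP * i for i in range(1601)]
F0, F1 = marginals(GRID, STEP)

if __name__ == "__main__":
    bound = chi_square_bound(F0, F1, STEP)
    for n in (10, 1000, 100000):
        print(n, n * hellinger_sq(n, F0, F1, STEP), bound)
```

The bound follows from $(\sqrt{a} - \sqrt{b})^2 \le (a - b)^2/a$ together with $f_1 - f_2 = (f_1 - f_0)/(\sqrt{n} + 1)$.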


Lemma 2.3. Let $g_1$ and $g_2$ be defined as in (20) and (21). Let $f_1$ and $f_2$ be the marginal densities of $X$ corresponding to the prior densities $g = g_1$ and $g = g_2$. Let $\alpha_0(x)$, $\alpha_1(x)$ and $\alpha_2(x)$ be the function $\alpha_G(x)$ corresponding to the prior densities $g = g_0$, $g = g_1$ and $g = g_2$. Let $\psi_0(x)$, $\psi_1(x)$ and $\psi_2(x)$ be the function $\psi_G(x)$ corresponding to the prior densities $g = g_0$, $g = g_1$ and $g = g_2$. Then for some constant $C > 0$

$$\int \big(\sqrt{f_1} - \sqrt{f_2}\big)^2 dx \le \frac{C}{n} \qquad (22)$$

and for all $x \in (a, b)$

$$(\phi_1(x) - \phi_2(x))^2 \ge \frac{l}{n} \times \frac{[\alpha_1(x)\psi_0(x) - \alpha_0(x)\psi_1(x)]^2}{\alpha_0^4(x)}, \qquad (23)$$

where $l > 0$ is a constant independent of $x$.


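The proof of Lemma 2.3 is given in Section 5. Under the assumptions of Lemma 2.3, it follows from (17) and (18) that

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] \ge \frac{l_3}{n}, \qquad (24)$$

where the constant $l_3$ is defined by an integral over $[x_1, x_2]$ involving $\alpha_0(x)$ [integrand illegible in source] and satisfies

$$l_3 < \infty, \qquad (25)$$

and $[x_1, x_2]$ is a compact subset of $(a, b)$.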


Theorem 2.1. Assume that $h'(x)$ exists. Then the best possible rate of empirical Bayes estimators is $O(1/n)$, i.e., for some $l > 0$

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] \ge \frac{l}{n}. \qquad (26)$$


Note that the above theorem shows that no empirical Bayes estimator can have a rate faster than $O(1/n)$ if $\theta$ is distributed within a compact (bounded) set. In the next section, we shall construct an empirical Bayes estimator whose rate is close to this minimax lower bound rate $O(1/n)$.

3  Construction of an Empirical Bayes Estimator with Rate Close to the Lower Bound Rate

In this section, we shall construct an empirical Bayes estimator under the assumption $\Omega \subset [\theta_1, \theta_2]$ and then show that the estimator has a rate much closer to the best possible rate obtained in Section 2 than any other estimator that has appeared in the literature under the same assumptions.

Note that the Bayes estimator is $\phi_G(x) = \psi_G(x)/\alpha_G(x)$, the ratio of two unknown functions. So we first estimate both unknown functions $\alpha_G(x)$ and $\psi_G(x)$. We then construct an estimator of the ratio based on the estimators of the numerator and the denominator.

We apply the kernel sequence method to construct the estimators of $\alpha_G(x)$ and $\psi_G(x)$. The idea of the kernel sequence method is to use a sequence of kernel functions and to let the kernel functions and the window bandwidths vary simultaneously to obtain good estimators. This idea was used in Gupta and Li (2001) to construct an empirical Bayes test for the exponential family. It will be used here again.

3.1  Construction  of  an  Estimator 


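Two kernel sequences were defined in Gupta and Li (2001), where an empirical Bayes test for the exponential family was constructed. Unfortunately, they are not good choices for this estimation problem. So we have to define two different kernel sequences.

For $m \ge 1$, let

$$K_{0m}(y) = \begin{cases} p_m y^m + p_{m-1} y^{m-1} + \cdots + p_0, & \text{if } 0 < y < 1,\\ 0, & \text{otherwise,} \end{cases} \qquad (27)$$

where for $0 \le s \le m$

$$p_s = \frac{(-1)^s (m + 1)(m + s + 1)!}{(s + 1)!\,s!\,(m - s)!}, \qquad (28)$$

and let

$$K_{1m}(y) = \begin{cases} q_m y^m + q_{m-1} y^{m-1} + \cdots + q_0, & \text{if } 0 < y < 1,\\ 0, & \text{otherwise,} \end{cases} \qquad (29)$$

where for $0 \le s \le m$

$$q_s = \frac{(-1)^{s+1} (m + 2)!\,(s + m + 1)!}{(m - 1)!\,s!\,(s + 2)\,s!\,(m - s)!}. \qquad (30)$$

In Section 5 of this paper, we shall prove that

$$\int_0^1 y^j K_{0m}(y)\,dy = \begin{cases} 1 & \text{if } j = 0,\\ 0 & \text{if } j = 1, 2, \ldots, m, \end{cases} \qquad (31)$$

and

$$\int_0^1 y^j K_{1m}(y)\,dy = \begin{cases} 0 & \text{if } j = 0, 2, 3, \ldots, m,\\ 1 & \text{if } j = 1. \end{cases} \qquad (32)$$

So $K_{0m}(y)$ and $K_{1m}(y)$ are the kernels with index $m$. $K_{0m}(y)$ will be used to estimate $\alpha_G(x)$ and $K_{1m}(y)$ will be used to estimate $\psi_G(x)$. Clearly, by (31),

$$\int (K_{0m}(y))^2\,dy = \int K_{0m}(y)\,(p_m y^m + p_{m-1} y^{m-1} + \cdots + p_0)\,dy = p_0 \qquad (33)$$

and

$$\int |K_{0m}(y)|\,dy \le \Big(\int (K_{0m}(y))^2\,dy\Big)^{1/2} = p_0^{1/2}. \qquad (34)$$

Similarly,

$$\int (K_{1m}(y))^2\,dy = q_1 \qquad (35)$$

and

$$\int |K_{1m}(y)|\,dy \le q_1^{1/2}. \qquad (36)$$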
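The moment conditions (31) and (32) and the identities (33) and (35) can be verified in exact rational arithmetic for small $m$. The sketch below is only a check, not part of the construction; the function names are hypothetical, and the coefficient formulas are those of (28) and (30) as read here, namely $p_s = (-1)^s (m+1)(m+s+1)!/[(s+1)!\,s!\,(m-s)!]$ and $q_s = (-1)^{s+1}(m+2)!\,(s+m+1)!/[(m-1)!\,(s+2)\,(s!)^2\,(m-s)!]$.

```python
from fractions import Fraction
from math import factorial

def p_coeffs(m):
    # Coefficients p_0, ..., p_m of K_{0m}, as in (28).
    return [Fraction((-1) ** s * (m + 1) * factorial(m + s + 1),
                     factorial(s + 1) * factorial(s) * factorial(m - s))
            for s in range(m + 1)]

def q_coeffs(m):
    # Coefficients q_0, ..., q_m of K_{1m}, as in (30); requires m >= 1.
    return [Fraction((-1) ** (s + 1) * factorial(m + 2) * factorial(s + m + 1),
                     factorial(m - 1) * (s + 2) * factorial(s) ** 2 * factorial(m - s))
            for s in range(m + 1)]

def moment(coeffs, j):
    # Integral over (0, 1) of y^j * sum_s a_s y^s, which equals sum_s a_s / (s + j + 1).
    return sum(a * Fraction(1, s + j + 1) for s, a in enumerate(coeffs))

def l2_norm_sq(coeffs):
    # Integral over (0, 1) of the squared polynomial.
    return sum(a * b * Fraction(1, s + t + 1)
               for s, a in enumerate(coeffs) for t, b in enumerate(coeffs))

if __name__ == "__main__":
    m = 3
    print([moment(p_coeffs(m), j) for j in range(m + 1)])  # moments of K_{0m}
    print([moment(q_coeffs(m), j) for j in range(m + 1)])  # moments of K_{1m}
```

Since the moment conditions form a linear system with a Hilbert matrix, $p$ and $q$ are the first and second columns of its inverse, which explains the fast growth of the coefficients with $m$.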
Now we consider the following three cases of (1), referred to collectively as (*):

(I) $a = -\infty$, $b = \infty$, $h(x) \downarrow 0$ as $x \downarrow -\infty$ and as $x \uparrow \infty$;

(II) $a = 0$, $b = \infty$, $h(x) \uparrow \infty$ or $h(x) \to l_1 > 0$ as $x \uparrow \infty$, and $h(x) \to l_2 > 0$ or $h(x) \downarrow 0$ as $x \downarrow 0$;

(III) $a = 0$, $b = 1$, $h(x) \to l_1 > 0$ or $h(x) \downarrow 0$ as $x \downarrow 0$, and $h(x) \to l_2 > 0$ or $h(x) \downarrow 0$ as $x \uparrow 1$.

Note that (*) includes the most common exponential family distributions.

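Let $u = u_n = 1/(\ln\ln n \vee 1)$ and $v = v_n = [\ln n] \vee 0 + 1$, where $[x]$ denotes the integer part of $x$. In Case (I), define for $x \in (-\infty, 0)$

$$\alpha_n(x) = \frac{1}{nu}\sum_{j=1}^n \frac{K_{0v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}, \qquad \psi_n(x) = \frac{1}{nu^2}\sum_{j=1}^n \frac{K_{1v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}, \qquad (37)$$

and for $x \in [0, \infty)$

$$\alpha_n(x) = \frac{1}{nu}\sum_{j=1}^n \frac{K_{0v}\big(\frac{x - X_j}{u}\big)}{h(X_j)}, \qquad \psi_n(x) = -\frac{1}{nu^2}\sum_{j=1}^n \frac{K_{1v}\big(\frac{x - X_j}{u}\big)}{h(X_j)}. \qquad (38)$$

In Case (II), define for $x \in (0, \infty)$

$$\alpha_n(x) = \frac{1}{nu}\sum_{j=1}^n \frac{K_{0v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}, \qquad \psi_n(x) = \frac{1}{nu^2}\sum_{j=1}^n \frac{K_{1v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}. \qquad (39)$$

In Case (III), define for $x \in (0, 1/2)$

$$\alpha_n(x) = \frac{1}{nu}\sum_{j=1}^n \frac{K_{0v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}, \qquad \psi_n(x) = \frac{1}{nu^2}\sum_{j=1}^n \frac{K_{1v}\big(\frac{X_j - x}{u}\big)}{h(X_j)}, \qquad (40)$$

and for $x \in [1/2, 1)$

$$\alpha_n(x) = \frac{1}{nu}\sum_{j=1}^n \frac{K_{0v}\big(\frac{x - X_j}{u}\big)}{h(X_j)}, \qquad \psi_n(x) = -\frac{1}{nu^2}\sum_{j=1}^n \frac{K_{1v}\big(\frac{x - X_j}{u}\big)}{h(X_j)}. \qquad (41)$$

Lemma 3.1 below says that $\alpha_n(x)$ and $\psi_n(x)$ are consistent estimators of $\alpha_G(x)$ and $\psi_G(x)$.

Note that $\theta_1 \le \theta \le \theta_2$. Therefore we propose the following empirical Bayes estimator of $\theta$:

$$\phi_n(x) = \Big(\frac{\psi_n(x)}{\alpha_n(x)} \vee \theta_1\Big) \wedge \theta_2. \qquad (42)$$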
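The mechanics of the estimator (42) in Case (I) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the forward window $(X_j - x)/u$ for $x < 0$ and the backward window $(x - X_j)/u$ with a sign change in $\psi_n$ for $x \ge 0$, the reading $v = [\ln n] + 1$, floating-point kernel coefficients, and a hypothetical fallback to the midpoint of $[\theta_1, \theta_2]$ when $\alpha_n(x) = 0$, which the text does not specify. All function names are illustrative.

```python
import math
import random
from math import factorial

def kernel_coeffs(m):
    # Floating-point coefficients of K_{0m} and K_{1m}, following (28) and (30).
    p = [(-1) ** s * (m + 1) * factorial(m + s + 1)
         / (factorial(s + 1) * factorial(s) * factorial(m - s)) for s in range(m + 1)]
    q = [(-1) ** (s + 1) * factorial(m + 2) * factorial(s + m + 1)
         / (factorial(m - 1) * (s + 2) * factorial(s) ** 2 * factorial(m - s))
         for s in range(m + 1)]
    return p, q

def poly_kernel(coeffs, y):
    # Polynomial kernel supported on (0, 1), evaluated by Horner's rule.
    if not 0.0 < y < 1.0:
        return 0.0
    acc = 0.0
    for a in reversed(coeffs):
        acc = acc * y + a
    return acc

def h_normal(x):
    # h(x) of the normal(theta, 1) family.
    return math.exp(-x * x / 2)

def eb_estimate(x, sample, h, theta1, theta2):
    # Sketch of phi_n(x) = ((psi_n / alpha_n) v theta1) ^ theta2 for Case (I); needs n >= 2.
    n = len(sample)
    u = 1.0 / max(math.log(math.log(n)), 1.0)
    v = max(int(math.log(n)), 0) + 1
    p, q = kernel_coeffs(v)
    sign = 1.0 if x < 0 else -1.0       # forward window for x < 0, backward otherwise
    a_sum = s_sum = 0.0
    for xj in sample:
        y = sign * (xj - x) / u
        a_sum += poly_kernel(p, y) / h(xj)
        s_sum += poly_kernel(q, y) / h(xj)
    alpha_n = a_sum / (n * u)
    psi_n = sign * s_sum / (n * u * u)
    if alpha_n == 0.0:
        return 0.5 * (theta1 + theta2)  # assumption: fallback not given in the text
    return min(max(psi_n / alpha_n, theta1), theta2)

if __name__ == "__main__":
    random.seed(1)
    data = [random.choice([-0.5, 0.5]) + random.gauss(0.0, 1.0) for _ in range(2000)]
    print([eb_estimate(x, data, h_normal, -1.0, 1.0) for x in (-1.0, 0.0, 1.0)])
```

The truncation to $[\theta_1, \theta_2]$ keeps the output bounded even when $\alpha_n(x)$ is small or negative; for moderate $n$ the weights $1/h(X_j)$ can make the untruncated ratio noisy.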


3.2  Rate  of  Convergence  of  the  Estimator 


First we investigate the rate of convergence of $\alpha_n$ and $\psi_n$. For the distributions of Case I, Case II and Case III, we see that $h(x)$ is either bounded from below or monotone in $(a, a_0]$ and $[b_0, b)$ for some $a_0$ and $b_0$. Let

$$\underline{h}(x) = h(x) \wedge \min\{h(x) : x \in [a_0, b_0]\}. \qquad (43)$$

As a result of the kernel sequence estimation, we have the following lemma.

Lemma 3.1. $\alpha_n(x)$ and $\psi_n(x)$ defined in (37)-(41) have the following properties:

$$|E[\alpha_n(x)] - \alpha_G(x)| \le c_1 p_0^{1/2} u^v \alpha_G(x), \qquad Var[\alpha_n(x)] \le \frac{c_2\,p_0\,\alpha_G(x)}{n u\,\underline{h}(x)}, \qquad (44)$$

and

$$|E[\psi_n(x)] - \psi_G(x)| \le c_1 q_1^{1/2} u^{v-1} \alpha_G(x), \qquad Var[\psi_n(x)] \le \frac{c_2\,q_1\,\alpha_G(x)}{n u^3\,\underline{h}(x)}, \qquad (45)$$

where $c_1$, $c_2$ are constants independent of $x$ and $G$.

From Lemma 3.1, we see that the mean square errors of $\alpha_n(x)$ and $\psi_n(x)$ are of order [expression illegible in source], owing to the kernel sequence method.

The following two lemmas are needed to compute the convergence rate of $\phi_n$. The first one gives a bound on the mean squared error of $\phi_n$.

Lemma 3.2. For any $0 < r \le 1$

$$E[|\phi_n(x) - \phi_G(x)|^2] \le c_3\,\alpha_G^{-2r}(x)\big[\{|E[\alpha_n(x)] - \alpha_G(x)|\}^{2r} + \{Var[\alpha_n(x)]\}^r\big] + c_4\,\alpha_G^{-2r}(x)\big[\{|E[\psi_n(x)] - \psi_G(x)|\}^{2r} + \{Var[\psi_n(x)]\}^r\big],$$

where $c_3$, $c_4$ are constants independent of $x$, $r$ and $G$.

Lemma 3.3. Recall $f_G(x) = \int f(x\mid\theta)\,dG(\theta)$. For any $0 < r < 1$

$$\int [f_G(x)]^{1-r}\,dx \le \begin{cases} c_5/(1 - r) & \text{for Case I and II},\\ c_5 & \text{for Case III}, \end{cases} \qquad (46)$$

where $c_5$ is a constant independent of $r$ and $G$.


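The proofs of Lemma 3.1 and Lemma 3.3 are in Section 5, and Lemma 3.2 is a modified version of Lemma 2.1 in Singh (1979). From Lemma 3.1 and Lemma 3.2, we have

$$R(\phi_n, G) - R(\phi_G, G) = \int E[|\phi_n(x) - \phi_G(x)|^2]\,\alpha_G(x)h(x)\,dx \qquad (47)$$

$$\le c_3 (c_1 p_0^{1/2} u^v)^{2r} + c_3 (c_2 p_0 (nu)^{-1})^r \int \alpha_G^{1-r}(x)h(x)\underline{h}^{-r}(x)\,dx + c_4 (c_1 q_1^{1/2} u^{v-1})^{2r} + c_4 (c_2 q_1 (nu^3)^{-1})^r \int \alpha_G^{1-r}(x)h(x)\underline{h}^{-r}(x)\,dx.$$

Note that $\underline{h}(x) \ge c_6 h(x)$ for some constant $c_6 > 0$. Then

$$\int \alpha_G^{1-r}(x)h(x)\underline{h}^{-r}(x)\,dx \le c_6^{-1}\int [\alpha_G(x)h(x)]^{1-r}\,dx = c_6^{-1}\int [f_G(x)]^{1-r}\,dx. \qquad (48)$$

Also note that [inequality for $u$ illegible in source], $p_0 = (v + 1)^2$ and $q_1 = [v(v + 1)(v + 2)]^2/3$. Then for Case I and II

$$R(\phi_n, G) - R(\phi_G, G) \le c_7 \cdot \frac{q_1^r}{(nu^3)^r (1 - r)} \le c_7\,\frac{(\ln n)^8}{n}(\ln\ln n)^2, \qquad (49)$$

where $c_7 = (c_3 + c_4)(c_1 + c_2)c_5/(3c_6)$ and $r = 1 - \ln\ln n/\ln n$, and for Case III

$$R(\phi_n, G) - R(\phi_G, G) \le c_7 \cdot \frac{q_1}{nu^3} \le c_7\,\frac{(\ln n)^6}{n}(\ln\ln n)^3. \qquad (50)$$

Then we have the following theorem.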


Theorem 3.1. Let the distribution of $X$ belong to one of the three cases defined by (*). If the support of the prior $G$ is within a compact set $[\theta_1, \theta_2] \subset \Omega_0$, then the empirical Bayes estimator $\phi_n(x)$ defined by (42) has a rate of convergence of $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$ for the distributions in Case I and Case II, and a rate of convergence of $O(n^{-1}(\ln n)^6(\ln\ln n)^3)$ for the distributions in Case III.

Corollary 3.1. Under the assumptions of Theorem 3.1, let $\mathcal{G}$ be the set of prior distributions defined by (6). Then for the distributions in Case I and II

$$\sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] = O(n^{-1}(\ln n)^8(\ln\ln n)^2) \qquad (51)$$

and for the distributions in Case III

$$\sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] = O(n^{-1}(\ln n)^6(\ln\ln n)^3). \qquad (52)$$

From Theorem 2.1, we know that $O(1/n)$ is the best possible rate. From Corollary 3.1, we see that $\phi_n$ constructed by (42) has a rate close to $O(1/n)$. Compared with the previous results in the literature, the rate in Corollary 3.1 is the fastest one under the same assumptions. See Section 6 for details on the comparisons.
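The logarithmic factors in the Case I and II rate come from the choice $r = 1 - \ln\ln n/\ln n$, for which $n^{-r} = \ln n/n$ and $1/(1 - r) = \ln n/\ln\ln n$. The sketch below checks this arithmetic numerically. It assumes the readings $u = 1/(\ln\ln n \vee 1)$, $v = [\ln n] + 1$ and $q_1 = [v(v+1)(v+2)]^2/3$, drops the constant $c_7$, and uses hypothetical function names; it is an illustration, not a proof.

```python
import math

def exponent_r(n):
    # r = 1 - ln ln n / ln n, the choice used for Case I and II (needs n > e).
    return 1.0 - math.log(math.log(n)) / math.log(n)

def regret_bound_case12(n):
    # q_1^r / ((n u^3)^r (1 - r)), the middle expression of the Case I/II bound without c_7.
    ln_n = math.log(n)
    r = exponent_r(n)
    u = 1.0 / max(math.log(ln_n), 1.0)
    v = max(int(ln_n), 0) + 1
    q1 = (v * (v + 1) * (v + 2)) ** 2 / 3
    return q1 ** r / ((n * u ** 3) ** r * (1.0 - r))

def target_rate(n):
    # (ln n)^8 (ln ln n)^2 / n, the stated rate for Case I and II.
    ln_n = math.log(n)
    return ln_n ** 8 * math.log(ln_n) ** 2 / n

if __name__ == "__main__":
    for n in (100, 10 ** 4, 10 ** 6):
        print(n, regret_bound_case12(n), target_rate(n))
```

In these runs the middle expression stays below the stated rate, since $q_1^r \le q_1$ and $q_1$ is of order $(\ln n)^6$.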

4  Examples 

We shall present a few examples in this section. The first three are from Singh (1979). Another example is used to illustrate the application of the empirical Bayes rule for a distribution in Case III. A brief comparison of our results with the results published in the literature is also presented. For a more comprehensive comparison, see Section 6.

Example 1. (Normal $(\theta, 1)$-family). Suppose that $X$ is a normal random variable with density

$$f(x\mid\theta) = (2\pi)^{-1/2}\exp(-\theta^2/2)\exp(\theta x)\exp(-x^2/2), \qquad -\infty < x < \infty.$$

Here the natural parameter space is $\Omega_0 = (-\infty, \infty)$. If $\theta$ is bounded, i.e. if $|\theta| \le \theta_0$, then $\phi_n$ of (42) with $\theta_1 = -\theta_0$ and $\theta_2 = \theta_0$ has a rate of convergence of $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$. Note that the rate of Singh's estimator is close to [rate illegible in source] for $r > 1$. So under the same assumption, our rate is faster.

Example 2. (Gamma $(\theta, s)$-family for $s > 1$). Suppose that $X$ is a gamma random variable with density

$$f(x\mid\theta) = (\Gamma(s))^{-1}(-\theta)^{s}\exp(\theta x)\,x^{s-1}, \qquad x > 0,\ s > 1.$$

Here the natural parameter space is $\Omega_0 = (-\infty, 0)$. If $-\infty < \theta_1 < \theta_2 < 0$, then $\phi_n$ of (42) has a rate of convergence of $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$, which is better than the polynomial rate in Singh (1979).

Example 3. (A population having a density with infinitely many discontinuities). Suppose that $X$ is a random variable with density

$$f(x\mid\theta) = (-\theta)(1 - \exp(\theta))\exp(\theta x)\sum_{i=0}^{\infty}(i + 1)\,I_{[i < x < i+1]}, \qquad x > 0.$$

Here the natural parameter space is $\Omega_0 = (-\infty, 0)$. For this distribution, Theorem 3.1 is applicable and our rate $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$ is better than Singh's rate [rate illegible in source] under the assumption that $\Omega \subset [\theta_1, \theta_2] \subset \Omega_0$. Since $h(x)$ is not differentiable, Pensky's method in Pensky (1998) fails to give a rate of convergence.

Now we give one example of the application of Theorem 3.1 to Case III distributions.

Example 4. Suppose that $X$ is a random variable from the following truncated exponential distribution:

$$f(x\mid\theta) = \theta\,(\exp(\theta) - 1)^{-1}\exp(\theta x), \qquad 0 < x < 1.$$

Here the natural parameter space is $\Omega_0 = (0, \infty)$. If $0 < \theta_1 < \theta_2 < \infty$, then $\phi_n(x)$ of (42) has a rate of convergence of $O(n^{-1}(\ln n)^6(\ln\ln n)^3)$.

5  Proofs. 


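Proof of Lemma 2.1. For any $G_1, G_2 \in \mathcal{G}$, we have

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}}\ [R(\phi_n, G) - R(\phi_G, G)] \ge \inf_{\phi_n \in \Phi}\ \sup_{G \in \{G_1, G_2\}}\ [R(\phi_n, G) - R(\phi_G, G)]. \qquad (53)$$

Write $\alpha(x) = \alpha_{G_1}(x) \wedge \alpha_{G_2}(x)$. Then it follows that

$$\inf_{\phi_n \in \Phi}\ \sup_{G \in \mathcal{G}} \int [E(\phi_n(x) - \phi_G(x))^2]\,\alpha_G(x)h(x)\,dx \qquad (54)$$

$$\ge \frac{1}{2}\inf_{\phi_n \in \Phi}\Big[\int [E(\phi_n(x) - \phi_{G_1}(x))^2]\,\alpha_{G_1}(x)h(x)\,dx + \int [E(\phi_n(x) - \phi_{G_2}(x))^2]\,\alpha_{G_2}(x)h(x)\,dx\Big]$$

$$\ge \frac{1}{2}\inf_{\phi_n \in \Phi}\Big[\int [E(\phi_n(x) - \phi_{G_1}(x))^2]\,\alpha(x)h(x)\,dx + \int [E(\phi_n(x) - \phi_{G_2}(x))^2]\,\alpha(x)h(x)\,dx\Big]$$

$$\ge \frac{1}{2}\inf_{\phi_n \in \Phi}\int \sup_{G \in \{G_1, G_2\}} [E(\phi_n(x) - \phi_G(x))^2]\,\alpha(x)h(x)\,dx$$

$$\ge \frac{1}{2}\int \inf_{\phi_n \in \Phi}\ \sup_{G \in \{G_1, G_2\}} [E(\phi_n(x) - \phi_G(x))^2]\,\alpha(x)h(x)\,dx.$$

This completes the proof of Lemma 2.1.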


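Proof of Lemma 2.3. From (21), it is clear that

$$f_1(x) - f_2(x) = \frac{f_1(x) - f_0(x)}{\sqrt{n} + 1}. \qquad (55)$$

Note that $f_0(x) = m_0 h(x)\int_{\theta_1}^{\theta_2}\exp(\theta x)\,d\theta$ and $f_1(x) = m_1 m_0 h(x)\int_{\theta_1}^{\theta_2}\exp(\theta(x + x_0))\,d\theta$. Then there exist $l_1 > 0$ and $l_2 > 0$ such that for all $x \in (a, b)$

$$l_1 \le \frac{f_0(x)}{f_1(x)} = \frac{\int_{\theta_1}^{\theta_2}\exp(\theta x)\,d\theta}{m_1\int_{\theta_1}^{\theta_2}\exp(\theta(x + x_0))\,d\theta} \le l_2. \qquad (56)$$

Therefore

$$\int \big(\sqrt{f_1(x)} - \sqrt{f_2(x)}\big)^2 dx \le \int \frac{[f_1(x) - f_2(x)]^2}{f_1(x)}\,dx \le \frac{1}{n}\int \Big[1 - \frac{f_0(x)}{f_1(x)}\Big]^2 f_1(x)\,dx \le \frac{C}{n} \qquad (57)$$

for some $C > 0$. On the other hand,

$$\phi_2(x) - \phi_1(x) = \frac{\sqrt{n}\,\psi_1(x) + \psi_0(x)}{\sqrt{n}\,\alpha_1(x) + \alpha_0(x)} - \frac{\psi_1(x)}{\alpha_1(x)} = \frac{\psi_0(x)\alpha_1(x) - \alpha_0(x)\psi_1(x)}{\alpha_1(x)[\sqrt{n}\,\alpha_1(x) + \alpha_0(x)]}. \qquad (58)$$

Since

$$\frac{\alpha_0(x)}{\alpha_1(x)} = \frac{f_0(x)}{f_1(x)}, \qquad (59)$$

it follows from (56) that there exists $l > 0$ such that

$$(\phi_2(x) - \phi_1(x))^2 \ge \frac{l}{n} \times \frac{[\alpha_1(x)\psi_0(x) - \alpha_0(x)\psi_1(x)]^2}{\alpha_0^4(x)}. \qquad (60)$$

This completes the proof of Lemma 2.3.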


Proof of (31) and (32). To prove (31), it is sufficient to show that

$$\begin{aligned}
p_0 + \frac{p_1}{2} + \cdots + \frac{p_m}{m+1} &= 1,\\
\frac{p_0}{2} + \frac{p_1}{3} + \cdots + \frac{p_m}{m+2} &= 0,\\
&\ \ \vdots\\
\frac{p_0}{m+1} + \frac{p_1}{m+2} + \cdots + \frac{p_m}{2m+1} &= 0.
\end{aligned} \qquad (61)$$

So we need to show that $p_s$ $(0 \le s \le m)$ given by (28) is the solution of (61). Using Cramer's rule, we have, for $0 \le s \le m$,

$$p_s = \frac{\det\nolimits_2}{\det\nolimits_1},$$

where $\det_1$, the denominator of $p_s$, is the determinant of the $(m+1)\times(m+1)$ coefficient matrix $[1/(i + j + 1)]_{0 \le i, j \le m}$ of (61), and $\det_2$, the numerator of $p_s$, is the determinant of the same matrix with the column of coefficients of $p_s$ replaced by $(1, 0, \ldots, 0)^T$. A simple calculation shows that

$$\det\nolimits_1 = \frac{[m!\,(m-1)!\cdots 2!]^3}{(m+1)!\,(m+2)!\cdots(2m+1)!}$$

and

$$\det\nolimits_2 = \frac{(-1)^{s+2}(m!)^2[(m-1)!\,(m-2)!\cdots 2!]^3\,(s + m + 1)!}{(m+2)!\,(m+3)!\cdots(2m+1)!\,(s+1)!\,s!\,(m-s)!}.$$

Thus

$$p_s = \frac{(-1)^s (m + 1)(m + s + 1)!}{(s + 1)!\,s!\,(m - s)!}.$$

So (31) is proved. The proof of (32) is similar and is omitted here.


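Proof of Lemma 3.1. We prove (44) only. The proof of (45) is similar. Let $\theta_0 = |\theta_1| \vee |\theta_2|$. In Case I, for $x < 0$, using Taylor expansion and (31),

$$E\Big[\frac{K_{0v}\big(\frac{X_1 - x}{u}\big)}{u\,h(X_1)}\Big] = \int_\Omega c(\theta)e^{\theta x}\,dG(\theta) + u^v\int_\Omega c(\theta)e^{\theta x}\Big[\int_0^1 K_{0v}(t)\frac{(\theta t)^v}{v!}e^{\theta u t t^*}\,dt\Big]dG(\theta), \qquad (62)$$

where $t^* \in (0, 1)$. Note that

$$\Big|\int_0^1 K_{0v}(t)\frac{(\theta t)^v}{v!}e^{\theta u t t^*}\,dt\Big| \le \frac{\theta_0^v e^{\theta_0}}{v!}\int_0^1 |K_{0v}(t)|\,dt \le e^{2\theta_0}p_0^{1/2}. \qquad (63)$$

Then

$$|E[\alpha_n(x)] - \alpha_G(x)| \le e^{2\theta_0}p_0^{1/2}u^v\alpha_G(x). \qquad (64)$$

Note that for $x < 0$ and $0 < t < 1$, $h(x + ut) \ge h(x)$ and

$$Var[\alpha_n(x)] \le \frac{e^{\theta_0}p_0\,\alpha_G(x)}{nu\,h(x)}. \qquad (65)$$

Similarly, $Var[\alpha_n(x)] \le e^{\theta_0}p_0\,\alpha_G(x)/[nu\,h(x)]$ for $x \ge 0$. Then (44) is proved.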


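Proof of Lemma 3.3. We prove (46) for the different cases of (*).

Case I. Let $\eta > 0$ satisfy $\theta_1 - \eta \in \Omega_0$ and $\theta_2 + \eta \in \Omega_0$. For any $\theta \in [\theta_1 - \eta, \theta_2 + \eta]$, $f(x\mid\theta)$ is bounded on $(-\infty, \infty)$. For any $(\theta, x) \in [\theta_1 - \eta, \theta_2 + \eta] \times [0, \infty)$, we have

$$f(x\mid\theta) = c(\theta)\exp(\theta x)h(x) \le c(\theta)\exp((\theta_2 + \eta)x)h(x) = \frac{c(\theta)}{c(\theta_2 + \eta)}\,f(x\mid\theta = \theta_2 + \eta). \qquad (66)$$

For any $(\theta, x) \in [\theta_1 - \eta, \theta_2 + \eta] \times (-\infty, 0)$, we have

$$f(x\mid\theta) \le \frac{c(\theta)}{c(\theta_1 - \eta)}\,f(x\mid\theta = \theta_1 - \eta). \qquad (67)$$

Since $c(\theta)$ is continuous, and hence bounded, on $[\theta_1 - \eta, \theta_2 + \eta]$, it follows from (66) and (67) that there exists $M \ge 1$ such that for any $(\theta, x) \in [\theta_1 - \eta, \theta_2 + \eta] \times (-\infty, \infty)$

$$f(x\mid\theta) \le M. \qquad (68)$$

Let $c = \max_{\theta \in [\theta_1, \theta_2]}\Big\{\dfrac{c(\theta)}{c(\theta + \eta)}, \dfrac{c(\theta)}{c(\theta - \eta)}\Big\} \vee 1$. Then $c < \infty$, and

$$\int [f_G(x)]^{1-r}\,dx = \int_0^\infty \Big[\int f(x\mid\theta)\,dG(\theta)\Big]^{1-r}dx + \int_{-\infty}^0 \Big[\int f(x\mid\theta)\,dG(\theta)\Big]^{1-r}dx \qquad (69)$$

$$\le c\int_0^\infty \Big[\int c(\theta + \eta)\exp((\theta + \eta)x)h(x)\,dG(\theta)\Big]^{1-r}\exp(-\eta(1 - r)x)\,dx + c\int_{-\infty}^0 \Big[\int c(\theta - \eta)\exp((\theta - \eta)x)h(x)\,dG(\theta)\Big]^{1-r}\exp(\eta(1 - r)x)\,dx$$

$$\le cM\Big[\int_0^\infty \exp(-\eta(1 - r)x)\,dx + \int_{-\infty}^0 \exp(\eta(1 - r)x)\,dx\Big] = \frac{2cM}{\eta(1 - r)}.$$

Case III. Note that there exists $M \ge 1$ such that for any $\theta \in [\theta_1, \theta_2]$ and for any $x \in (0, 1)$

$$f(x\mid\theta) \le M. \qquad (70)$$

Then

$$\int_0^1 [f_G(x)]^{1-r}\,dx \le \int_0^1 M^{1-r}\,dx \le M. \qquad (71)$$

Case II. Note that

$$\int_0^\infty [f_G(x)]^{1-r}\,dx = \int_0^1 [f_G(x)]^{1-r}\,dx + \int_1^\infty [f_G(x)]^{1-r}\,dx. \qquad (72)$$

Then Lemma 3.3 in this case follows by the methods used in the proofs for Case I and Case III.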


6  Summary  and  Discussion 

In this paper, we have studied the estimation problem in the exponential family. First we proved that the best possible rate of empirical Bayes estimators is $O(1/n)$ if $\theta$ is distributed within a compact (bounded) set. This sets a target for the construction of empirical Bayes estimators. For a long time, $O(1/n)$ has been thought to be a natural lower bound rate, but this had never been proved.

We have also constructed an estimator which achieves a rate of $O(n^{-1}(\ln n)^8(\ln\ln n)^2)$ under the assumption that $\theta$ is distributed within a compact (bounded) set. Under the same assumption, this is the fastest rate compared with the rates that have appeared in the literature before.

The most significant recent results on this estimation problem were published by Singh (1979) and Pensky (1998). In their papers, they constructed empirical Bayes estimators and investigated the convergence rates of the estimators. Singh (1979) used the kernel method and Pensky (1998) applied advanced wavelet techniques in their constructions. Both papers allow $\theta$ to be unbounded but obtain only a polynomial rate. For Singh's result, the rate stays the same even under the additional assumption that $\Omega$ is a compact set. So our result is much better than his under the same assumption. To get a rate like the one obtained here from Pensky's result, the existence of all moments of $\theta$ is necessary. Also, the degree of smoothness of $f(x\mid\theta)$ is a key factor in determining the rate of convergence. If the degree of smoothness is low, the rate is slow even if $\theta$ is distributed within a compact set.